\chapter{Restart Capabilities and Utilities}\label{restart}

\section{Restart Management}\label{restart:management}

Dakota was developed for solving problems that require multiple calls
to computationally expensive simulation codes. In some cases you may
want to conduct the same optimization, but to a tighter final
convergence tolerance. This would be costly if the entire optimization
analysis had to be repeated. Interruptions imposed by computer usage
policies, power outages, and system failures could also result in
costly delays. However, Dakota automatically records the variable and
response data from all function evaluations so that new executions of
Dakota can pick up where previous executions left off.

The Dakota restart file (e.g., \path{dakota.rst}) is written in a
binary format, leveraging the Boost.Serialization library. While the
cross-platform portability may not be as general as, say, the XDR standard,
experience has shown it to be a sufficiently portable format to meet
most users' needs.  Caution should be exercised to ensure consistent
endianness of the computer architectures involved when attempting to leverage
the restart capability in a multi-host environment.  For example, if a
little-endian host is used to create the restart file, it can only be reliably
ported to and read on a host that is also little-endian.  As shown in
Section~\ref{tutorial:installation:running},
the primary restart commands for Dakota are
\texttt{-read\_restart}, \texttt{-write\_restart}, and \texttt{-stop\_restart}.

To write a restart file using a particular name, the
\texttt{-write\_restart} command line input (may be abbreviated as
\texttt{-w}) is used:
\begin{small}
\begin{verbatim}
    dakota -i dakota.in -write_restart my_restart_file
\end{verbatim}
\end{small}
If no \texttt{-write\_restart} specification is used, then Dakota will
still write a restart file, but using the default name
\path{dakota.rst} instead of a user-specified name.  To turn restart
recording off, the user may select \texttt{deactivate restart\_file}
in the \texttt{interface} specification (refer to the Interface
Commands chapter in the Dakota Reference Manual~\cite{RefMan} for
additional information).  This can increase execution speed and reduce
disk storage requirements, but at the expense of a loss in the ability
to recover and continue a run that terminates prematurely.  Obviously,
this option is not recommended when function evaluations are costly or
prone to failure. Note that using the \texttt{deactivate restart\_file}
specification will result in a zero-length restart file
with the default name \path{dakota.rst}.

To restart Dakota from a restart file, the \texttt{-read\_restart}
command line input (may be abbreviated as \texttt{-r}) is used:
\begin{small}
\begin{verbatim}
    dakota -i dakota.in -read_restart my_restart_file
\end{verbatim}
\end{small}

If no \texttt{-read\_restart} specification is used, then Dakota will
not read restart information from any file (i.e., the default is no
restart processing).

As of version 6.0, Dakota provides an input file specification block
that gives users additional control over the function evaluation cache,
duplicate evaluation detection, and restart data file entries.
In the interface's analysis driver definition, it is possible to provide
additional \texttt{deactivate} parameters in the specification block
(e.g., \texttt{deactivate strict\_cache\_equality}).
It should be noted that, by default, Dakota's
evaluation cache and restart capabilities are based on strict binary equality.  
This provides a performance advantage, as it permits a hash-based data
structure to be used to search the evaluation cache.  The use of the
\texttt{deactivate strict\_cache\_equality} keywords may prevent cache misses,
which can occur when attempting to use a restart file on a machine different
from the one on which it was generated. Specifying those keywords in the Dakota
input file when performing a restart analysis should be considered
judiciously, on a case-by-case basis, since there will be a performance penalty
for the non-hashed evaluation cache lookups for detection of duplicates. That
said, there are situations in which it is desirable to accept the performance
hit of the slower cache lookups (for example, when the analysis driver is
computationally expensive).
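
As an illustration, the following fragment is a minimal sketch of an
\texttt{interface} block that relaxes the strict equality test. The
\texttt{fork} interface and the driver name \path{my_driver} are
placeholders assumed for this sketch (they are not taken from a
particular study); only the \texttt{deactivate strict\_cache\_equality}
line is the point of the example:
\begin{small}
\begin{verbatim}
    interface
      fork
        analysis_drivers = 'my_driver'    # hypothetical driver name
      deactivate strict_cache_equality    # relax exact binary matching
\end{verbatim}
\end{small}
Refer to the Interface Commands chapter in the Dakota Reference
Manual~\cite{RefMan} for the authoritative keyword syntax.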

If the \texttt{-write\_restart} and \texttt{-read\_restart}
specifications identify the same file (including the case where
\texttt{-write\_restart} is not specified and \texttt{-read\_restart}
identifies \path{dakota.rst}), then new evaluations will be appended
to the existing restart file. If the \texttt{-write\_restart} and
\texttt{-read\_restart} specifications identify different files, then
the evaluations read from the file identified by
\texttt{-read\_restart} are first written to the
\texttt{-write\_restart} file. Any new evaluations are then appended
to the \texttt{-write\_restart} file. In this way, restart operations
can be chained together indefinitely with the assurance that all of
the relevant evaluations are present in the latest restart file.
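
As a sketch of such a chain, the commands below run a study three
times, each time reading the previous restart file and writing a new
one. The file names \path{run1.rst}, \path{run2.rst}, and
\path{run3.rst} are arbitrary names chosen for illustration, and in
practice the input file would typically be modified between runs
(e.g., to tighten a convergence tolerance):
\begin{small}
\begin{verbatim}
    # Initial run: all evaluations are recorded in run1.rst
    dakota -i dakota.in -w run1.rst
    # First restart: run1.rst records are copied, new ones appended
    dakota -i dakota.in -r run1.rst -w run2.rst
    # Second restart: run2.rst holds everything from both earlier runs
    dakota -i dakota.in -r run2.rst -w run3.rst
\end{verbatim}
\end{small}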

To read in only a portion of a restart file, the
\texttt{-stop\_restart} control (may be abbreviated as \texttt{-s}) is
used to specify the number of entries to be read from the
database. Note that this integer value corresponds to the restart
record processing counter (as can be seen when using the
\texttt{print} utility; see Section~\ref{restart:utility:print} below),
which may differ from the evaluation numbers used in the previous run
if, for example, any duplicates were detected (since these duplicates
are not recorded in the restart file).  In the case of a
\texttt{-stop\_restart} specification, it is usually desirable to
specify a new restart file using \texttt{-write\_restart} so as to
remove the records of erroneous or corrupted function evaluations. For
example, to read in the first 50 evaluations from
\path{dakota.rst}:
\begin{small}
\begin{verbatim}
    dakota -i dakota.in -r dakota.rst -s 50 -w dakota_new.rst
\end{verbatim}
\end{small}

The \path{dakota_new.rst} file will contain the 50 processed
evaluations from \path{dakota.rst} as well as any new evaluations.
All evaluations following the 50$^{\mathrm{th}}$ in \path{dakota.rst}
are excluded from the new restart file.

Dakota's restart algorithm relies on its duplicate detection
capabilities. Processing a restart file populates the list of function
evaluations that have been performed. Then, when the study is
restarted, it is started from the beginning (not a ``warm'' start) and
many of the function evaluations requested by the iterator are
intercepted by the duplicate detection code. This approach has the
primary advantage of restoring the complete state of the iteration
(including the ability to correctly detect subsequent duplicates) for
all iterators and multi-iterator methods without the need for
iterator-specific restart code. However, the possibility exists for
numerical round-off error to cause a divergence between the
evaluations performed in the previous and restarted studies. This has
been extremely rare to date.

\section{The Dakota Restart Utility}\label{restart:utility}

The Dakota restart utility program provides a variety of facilities
for managing restart files from Dakota executions. The executable
program name is \path{dakota_restart_util} and it has the
following options, as shown by the usage message returned when
executing the utility without any options:
\begin{footnotesize}
\begin{verbatim}
Usage:
  dakota_restart_util command <arg1> [<arg2> <arg3> ...] --options
    dakota_restart_util print <restart_file>
    dakota_restart_util to_neutral <restart_file> <neutral_file>
    dakota_restart_util from_neutral <neutral_file> <restart_file>
    dakota_restart_util to_tabular <restart_file> <text_file>
      [--custom_annotated [header] [eval_id] [interface_id]]
      [--output_precision <int>]
    dakota_restart_util remove <double> <old_restart_file> <new_restart_file>
    dakota_restart_util remove_ids <int_1> ... <int_n> <old_restart_file> <new_restart_file>
    dakota_restart_util cat <restart_file_1> ... <restart_file_n> <new_restart_file>
options:
  --help                       show dakota_restart_util help message
  --custom_annotated arg       tabular file options: header, eval_id, 
                               interface_id
  --freeform                   tabular file: freeform format
  --output_precision arg (=10) set tabular output precision
\end{verbatim}
\end{footnotesize}

Several of these functions involve format conversions. In particular,
the binary format used for restart files can be converted to ASCII
text and printed to the screen, converted to and from a neutral file
format, or converted to a tabular format for importing into 3rd-party
plotting/graphics programs. In addition, a restart file with corrupted
data can be repaired by value or id, and multiple restart files can be
combined to create a master database.

\subsection{Print}\label{restart:utility:print}

The \texttt{print} option outputs the contents of a particular restart
file in human-readable format, since the binary format is not
convenient for direct inspection. The restart data is printed in full
precision, so that (near-)exact matching of points is possible for
restarted runs or corrupted data removals. For example, the following
command
\begin{small}
\begin{verbatim}
    dakota_restart_util print dakota.rst
\end{verbatim}
\end{small}

results in output similar to the following (from the
example in Section~\ref{additional:cylinder}):
\begin{small}
\begin{verbatim}
    ------------------------------------------
    Restart record    1  (evaluation id    1):
    ------------------------------------------
    Parameters:
                          1.8000000000000000e+00 intake_dia
                          1.0000000000000000e+00 flatness

    Active response data:
    Active set vector = { 3 3 3 3 }
                         -2.4355973813420619e+00 obj_fn
                         -4.7428486677140930e-01 nln_ineq_con_1
                         -4.5000000000000001e-01 nln_ineq_con_2
                          1.3971143170299741e-01 nln_ineq_con_3
     [ -4.3644298963447897e-01  1.4999999999999999e-01 ] obj_fn gradient
     [  1.3855136437818300e-01  0.0000000000000000e+00 ] nln_ineq_con_1 gradient
     [  0.0000000000000000e+00  1.4999999999999999e-01 ] nln_ineq_con_2 gradient
     [  0.0000000000000000e+00 -1.9485571585149869e-01 ] nln_ineq_con_3 gradient

    ------------------------------------------
    Restart record    2  (evaluation id    2):
    ------------------------------------------
    Parameters:
                          2.1640000000000001e+00 intake_dia
                          1.7169994018008317e+00 flatness

    Active response data:
    Active set vector = { 3 3 3 3 }
                         -2.4869127192988878e+00 obj_fn
                          6.9256958799989843e-01 nln_ineq_con_1
                         -3.4245008972987528e-01 nln_ineq_con_2
                          8.7142207937157910e-03 nln_ineq_con_3
     [ -4.3644298963447897e-01  1.4999999999999999e-01 ] obj_fn gradient
     [  2.9814239699997572e+01  0.0000000000000000e+00 ] nln_ineq_con_1 gradient
     [  0.0000000000000000e+00  1.4999999999999999e-01 ] nln_ineq_con_2 gradient
     [  0.0000000000000000e+00 -1.6998301774282701e-01 ] nln_ineq_con_3 gradient

    ...<snip>...

    Restart file processing completed: 11 evaluations retrieved.
\end{verbatim}
\end{small}

\subsection{To/From Neutral File Format}\label{restart:utility:neutral}

A Dakota restart file can be converted to a neutral file format using
a command like the following:
\begin{small}
\begin{verbatim}
    dakota_restart_util to_neutral dakota.rst dakota.neu
\end{verbatim}
\end{small}
which results in a report similar to the following:
\begin{small}
\begin{verbatim}
    Writing neutral file dakota.neu
    Restart file processing completed: 11 evaluations retrieved.
\end{verbatim}
\end{small}

Similarly, a neutral file can be returned to binary format using a
command like the following:
\begin{small}
\begin{verbatim}
    dakota_restart_util from_neutral dakota.neu dakota.rst
\end{verbatim}
\end{small}
which results in a report similar to the following:
\begin{small}
\begin{verbatim}
    Reading neutral file dakota.neu
    Writing new restart file dakota.rst
    Neutral file processing completed: 11 evaluations retrieved.
\end{verbatim}
\end{small}

The contents of the generated neutral file are similar to the 
following (from the first two records for the
example in Section~\ref{additional:cylinder}):
\begin{small}
\begin{verbatim}
    6 7 2 1.8000000000000000e+00 intake_dia 1.0000000000000000e+00 flatness 0 0 0 0
    NULL 4 2 1 0 3 3 3 3 1 2 obj_fn nln_ineq_con_1 nln_ineq_con_2 nln_ineq_con_3
      -2.4355973813420619e+00 -4.7428486677140930e-01 -4.5000000000000001e-01
       1.3971143170299741e-01 -4.3644298963447897e-01  1.4999999999999999e-01
       1.3855136437818300e-01  0.0000000000000000e+00  0.0000000000000000e+00
       1.4999999999999999e-01  0.0000000000000000e+00 -1.9485571585149869e-01 1
    6 7 2 2.1640000000000001e+00 intake_dia 1.7169994018008317e+00 flatness 0 0 0 0
    NULL 4 2 1 0 3 3 3 3 1 2 obj_fn nln_ineq_con_1 nln_ineq_con_2 nln_ineq_con_3
      -2.4869127192988878e+00 6.9256958799989843e-01 -3.4245008972987528e-01
       8.7142207937157910e-03 -4.3644298963447897e-01  1.4999999999999999e-01
       2.9814239699997572e+01  0.0000000000000000e+00  0.0000000000000000e+00
       1.4999999999999999e-01  0.0000000000000000e+00 -1.6998301774282701e-01 2
\end{verbatim}
\end{small}

This format is not intended for direct viewing (\texttt{print} should
be used for this purpose). Rather, the neutral file capability has
been used in the past for managing portability of restart data across
platforms whose computer architectures differ in endianness (e.g.,
the file was created on a little-endian host, but Dakota must be run with
restart on a big-endian host). The neutral file format has also been
shown to be useful for advanced repair of restart records
(in cases where the techniques of Section~\ref{restart:utility:removal}
were insufficient).
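
A minimal sketch of this cross-platform workflow, using only the
commands described above, might look as follows. The file names are
illustrative, and the transfer step is left as a comment because it
depends on the local environment:
\begin{small}
\begin{verbatim}
    # On the host that produced the restart file
    dakota_restart_util to_neutral dakota.rst dakota.neu
    # (copy dakota.neu to the target host with any file transfer tool)
    # On the target host: rebuild a native restart file, then restart
    dakota_restart_util from_neutral dakota.neu dakota.rst
    dakota -i dakota.in -read_restart dakota.rst
\end{verbatim}
\end{small}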

\subsection{To Tabular Format}\label{restart:utility:tabular}

Conversion of a binary restart file to a tabular format enables
convenient import of this data into 3rd-party post-processing tools
such as Matlab, TECplot, Excel, etc. This facility is similar to the
\texttt{tabular\_data} option in the Dakota input file specification
(described in Section~\ref{output:tabular}), but with two important
differences:
\begin{enumerate}
\item No function evaluations are suppressed as they are with
  \texttt{tabular\_data} (i.e., any internal finite difference
  evaluations are included).
\item The conversion can be performed after Dakota completion, i.e.,
  for Dakota runs executed previously.
\end{enumerate}

An example command for converting a restart file to tabular format is:
\begin{verbatim}
    dakota_restart_util to_tabular dakota.rst dakota.m
\end{verbatim}
which results in a report similar to the following:
\begin{verbatim}
    Writing tabular text file dakota.m
    Restart file processing completed: 10 evaluations tabulated.
\end{verbatim}

The contents of the generated tabular file are similar to the following 
(from the example in Section~\ref{additional:textbook:examples:gradient2}).
Note that while evaluations resulting from numerical derivative offsets would be
reported (as described above), derivatives returned as part of the
evaluations are not reported (since they do not readily fit within a
compact tabular format):
\begin{footnotesize}
\begin{verbatim}
%eval_id interface             x1             x2         obj_fn nln_ineq_con_1 nln_ineq_con_2 
       1     NO_ID           0.9            1.1         0.0002           0.26           0.76 
       2     NO_ID    0.58256179   0.4772224441   0.1050555937   0.1007670171 -0.06353963386 
       3     NO_ID           0.5   0.4318131566   0.1667232695  0.03409342169 -0.06353739777 
       4     NO_ID           0.5   0.3695495062   0.2204806721  0.06522524692  -0.1134331625 
       5     NO_ID           0.5   0.3757758727   0.2143316122  0.06211206365  -0.1087924935 
       6     NO_ID           0.5   0.3695495062   0.2204806721  0.06522524692  -0.1134331625 
       7     NO_ID  0.5005468682  -0.5204065326    5.405888123   0.5107504335  0.02054952507 
       8     NO_ID  0.5000092554   0.4156974409   0.1790558059  0.04216053506 -0.07720026537 
       9     NO_ID   0.500000919   0.4302129149   0.1679019175   0.0348944616  -0.0649173074 
      10     NO_ID    0.50037519  -0.2214765079    2.288391116   0.3611135847  -0.2011357515 
...
\end{verbatim}
\end{footnotesize}
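
Because the tabular file is plain text, it can also be inspected with
standard command-line tools before it is imported elsewhere. The
following is a small sketch rather than a Dakota facility: it writes
the first three rows of the table above to a hypothetical file named
\path{dakota_sample.dat} and uses \texttt{awk} to report the evaluation
with the smallest \texttt{obj\_fn}. It assumes the default annotated
layout shown above, in which \texttt{obj\_fn} is the fifth column:
\begin{footnotesize}
\begin{verbatim}
# Write a small sample in the annotated tabular layout shown above
cat > dakota_sample.dat <<'EOF'
%eval_id interface             x1             x2         obj_fn nln_ineq_con_1 nln_ineq_con_2
       1     NO_ID           0.9            1.1         0.0002           0.26           0.76
       2     NO_ID    0.58256179   0.4772224441   0.1050555937   0.1007670171 -0.06353963386
       3     NO_ID           0.5   0.4318131566   0.1667232695  0.03409342169 -0.06353739777
EOF

# Skip the %-commented header row, then track the smallest obj_fn (column 5)
best=$(awk '!/^%/ { if (!seen || $5 + 0 < min) { min = $5 + 0; id = $1; seen = 1 } }
            END   { print "best eval_id:", id, "obj_fn:", min }' dakota_sample.dat)
echo "$best"
\end{verbatim}
\end{footnotesize}
If a different column layout is requested through the options
described next, the column index used in the \texttt{awk} program has
to be adjusted accordingly.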

\textbf{Controlling tabular format:} The command-line options
\texttt{--freeform} and \texttt{--custom\_annotated} give control of
headers in the resulting tabular file.  Freeform will generate a tabular
file with neither a leading header row nor leading columns (variable and
response values only).  Custom annotated format accepts any or all of
the options:
\begin{itemize}
\item \texttt{header}: include \%-commented header row with labels
\item \texttt{eval\_id}: include leading column with evaluation ID
\item \texttt{interface\_id}: include leading column with interface ID
\end{itemize}
For example, to recover Dakota 6.0 tabular format, which contained a
header row, leading column with evaluation ID, but no interface ID:
\begin{footnotesize}
\begin{verbatim}
dakota_restart_util to_tabular dakota.rst dakota.m --custom_annotated header eval_id
\end{verbatim}
\end{footnotesize}
which results in output similar to the following:
\begin{footnotesize}
\begin{verbatim}
%eval_id             x1             x2         obj_fn nln_ineq_con_1 nln_ineq_con_2 
1                   0.9            1.1         0.0002           0.26           0.76 
2               0.90009            1.1 0.0001996404857   0.2601620081       0.759955 
3               0.89991            1.1 0.0002003604863   0.2598380081       0.760045 
...
\end{verbatim}
\end{footnotesize}

Finally, \texttt{--output\_precision <int>} will generate tabular
output with the specified number of digits of precision.

\subsection{Concatenation of Multiple Restart Files}\label{restart:utility:concatenation}

In some instances, it is useful to combine restart files into a single
master function evaluation database. For example, when constructing a
data fit surrogate model, data from previous studies can be pulled in
and reused to create a combined data set for the surrogate fit. An
example command for concatenating multiple restart files is:
\begin{small}
\begin{verbatim}
    dakota_restart_util cat dakota.rst.1 dakota.rst.2 dakota.rst.3 dakota.rst.all
\end{verbatim}
\end{small}
which results in a report similar to the following:
\begin{verbatim}
    Writing new restart file dakota.rst.all
    dakota.rst.1 processing completed: 10 evaluations retrieved.
    dakota.rst.2 processing completed: 110 evaluations retrieved.
    dakota.rst.3 processing completed: 65 evaluations retrieved.
\end{verbatim}

The \path{dakota.rst.all} database now contains 185 evaluations and
can be read in for use in a subsequent Dakota study using the
\texttt{-read\_restart} option to the \path{dakota} executable (see
Section~\ref{restart:management}).

\subsection{Removal of Corrupted Data}\label{restart:utility:removal}

On occasion, a simulation or computer system failure may cause a
corruption of the Dakota restart file. For example, a simulation crash
may result in failure of a post-processor to retrieve meaningful data.
If 0's (or other erroneous data) are returned from the user's
\texttt{analysis\_driver}, then this bad data will get recorded in the
restart file. If there is a clear demarcation of where corruption
initiated (typical in a process with feedback, such as gradient-based
optimization), then use of the \texttt{-stop\_restart} option for the
\path{dakota} executable can be effective in continuing the study
from the point immediately prior to the introduction of bad data. If,
however, there are interspersed corruptions throughout the restart
database (typical in a process without feedback, such as sampling),
then the \texttt{remove} and \texttt{remove\_ids} options of
\path{dakota_restart_util} can be useful.

An example of the command syntax for the \texttt{remove} option is:
\begin{small}
\begin{verbatim}
    dakota_restart_util remove 2.e-04 dakota.rst dakota.rst.repaired
\end{verbatim}
\end{small}
which results in a report similar to the following:
\begin{small}
\begin{verbatim}
    Writing new restart file dakota.rst.repaired
    Restart repair completed: 65 evaluations retrieved, 2 removed, 63 saved.
\end{verbatim}
\end{small}
where any evaluations in \path{dakota.rst} having an active response
function value that matches \texttt{2.e-04} within machine precision
are discarded when creating \path{dakota.rst.repaired}.

An example of the command syntax for the \texttt{remove\_ids} option is:
\begin{small}
\begin{verbatim}
    dakota_restart_util remove_ids 12 15 23 44 57 dakota.rst dakota.rst.repaired
\end{verbatim}
\end{small}
which results in a report similar to the following:
\begin{small}
\begin{verbatim}
    Writing new restart file dakota.rst.repaired
    Restart repair completed: 65 evaluations retrieved, 5 removed, 60 saved.
\end{verbatim}
\end{small}
where evaluation ids \texttt{12}, \texttt{15}, \texttt{23},
\texttt{44}, and \texttt{57} have been discarded when creating
\path{dakota.rst.repaired}. An important detail is that, unlike the
\texttt{-stop\_restart} option, which operates on restart record numbers
(see Section~\ref{restart:management}), the \texttt{remove\_ids}
option operates on evaluation ids.  Thus, removal is not necessarily
based on the order of appearance in the restart file. This distinction
is important when removing restart records for a run that contained
either asynchronous or duplicate evaluations, since the restart
insertion order and evaluation ids may not correspond in these cases
(asynchronous evaluations have ids assigned in the order of job
creation but are inserted in the restart file in the order of job
completion, and duplicate evaluations are not recorded, which
introduces offsets between evaluation id and record number). This can
also be important if removing records from a concatenated restart
file, since the same evaluation id could appear more than once. In
this case, all evaluation records with ids matching the
\texttt{remove\_ids} list will be removed.
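
Because the two numbering schemes can differ, it can help to list both
before choosing values for \texttt{-stop\_restart} or
\texttt{remove\_ids}. The following sketch assumes that the
\texttt{print} output contains header lines of the form shown in
Section~\ref{restart:utility:print} and filters them with the standard
\texttt{grep} tool; the evaluation ids and file names are illustrative
only:
\begin{small}
\begin{verbatim}
    # List the "Restart record N (evaluation id M)" header lines
    dakota_restart_util print dakota.rst | grep "Restart record"
    # Drop the evaluations identified as bad (ids are examples only)
    dakota_restart_util remove_ids 12 15 dakota.rst dakota.rst.repaired
    # Restart from the repaired file, writing to a new restart file
    dakota -i dakota.in -r dakota.rst.repaired -w dakota_new.rst
\end{verbatim}
\end{small}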

If neither of these removal options is sufficient to handle a
particular restart repair need, then the fallback position is to
resort to direct editing of a neutral file (refer to
Section~\ref{restart:utility:neutral}) to perform the necessary
modifications.
